Markov Decision Processes with Incomplete Information and Semiuniform Feller Transition Probabilities
Abstract
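This paper deals with control of partially observable discrete-time stochastic systems. It introduces and studies Markov Decision Processes with Incomplete Information and with semiuniform Feller transition probabilities. The important feature of these models is that their classic reduction to Completely Observable Markov Decision Processes with belief states preserves semiuniform Feller continuity of transition probabilities. Under mild assumptions on cost functions, optimal policies exist, optimality equations hold, and value iterations converge to optimal values for these models. In particular, for Partially Observable Markov Decision Processes the results of this paper imply new and generalize several known sufficient conditions on transition and observation probabilities for weak continuity of transition probabilities for Markov Decision Processes with belief states, the existence of optimal policies, validity of optimality equations defining optimal policies, and convergence of value iterations to optimal values.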
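The reduction to belief states mentioned in the abstract can be illustrated in the simplest setting. The sketch below is not the paper's construction, which concerns general Borel state, action, and observation spaces; it assumes a finite POMDP, and the array names `P` and `Q`, their layouts, and the numbers in the example are hypothetical choices made for illustration.

```python
import numpy as np

def belief_update(belief, action, observation, P, Q):
    """One Bayes step of the belief-state reduction for a finite POMDP.

    Assumed (hypothetical) layouts:
      belief: shape (S,), current distribution over hidden states
      P:      shape (A, S, S), P[a, s, s2] = Pr(next state s2 | state s, action a)
      Q:      shape (A, S, O), Q[a, s2, o] = Pr(observation o | action a, next state s2)
    """
    # Prediction: push the belief through the transition kernel.
    predicted = belief @ P[action]
    # Correction: weight each next state by the likelihood of the observation.
    unnormalized = predicted * Q[action][:, observation]
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief and action")
    return unnormalized / total

# Illustrative numbers only: two hidden states, one action, two observations.
P = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])
Q = np.array([[[0.8, 0.2],
               [0.3, 0.7]]])
prior = np.array([0.5, 0.5])
posterior = belief_update(prior, action=0, observation=0, P=P, Q=Q)
```

The belief vector plays the role of the fully observed state in the reduced model; the continuity properties studied in the paper concern how this updated belief depends on the prior belief and the action.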
Related papers
Average Cost Markov Decision Processes with Weakly Continuous Transition Probabilities
This paper presents sufficient conditions for the existence of stationary optimal policies for average-cost Markov Decision Processes with Borel state and action sets and with weakly continuous transition probabilities. The one-step cost functions may be unbounded, and action sets may be noncompact. The main contributions of this paper are: (i) general sufficient conditions for the existence of ...
Factored Markov Decision Processes with Imprecise Transition Probabilities
This paper presents a short survey of the research we have carried out on planning under uncertainty where we consider different forms of imprecision on the probability transition functions. Our main results are on efficient solutions for Markov Decision Processes with Imprecise Transition Probabilities (MDP-IPs), a generalization of a Markov Decision Process where the imprecise probabilities are...
Time-Average Optimality for Semi-Markov Control Processes with Feller Transition Probabilities
Semi-Markov control processes with Borel state space and Feller transition probabilities are considered. We prove that under fairly general conditions the two expected average costs: the time-average and the ratio-average coincide for stationary policies. Moreover, the optimal stationary policy for the ratio-average cost criterion is also optimal for the time-average cost criterion.
Loss Bounds for Uncertain Transition Probabilities in Markov Decision Processes
We analyze losses resulting from uncertain transition probabilities in Markov decision processes with bounded nonnegative rewards. We assume that policies are pre-computed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true transition probabilities. Our approach analyzes the growth of errors incurred by stepping backwa...
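The setup described in this snippet, a policy planned on estimated transition probabilities but run under the true ones, can be sketched numerically. This is only an illustration, not the cited paper's analysis: it assumes a finite discounted model, and the discount factor, the array layouts, and every number below are hypothetical.

```python
import numpy as np

def value_iteration(P, R, gamma, tol=1e-10):
    """Dynamic programming for a finite discounted MDP.

    Assumed (hypothetical) layouts: P has shape (A, S, S) with
    P[a, s, t] = Pr(t | s, a); R has shape (S, A).
    Returns the value estimate and a greedy policy.
    """
    V = np.zeros(P.shape[1])
    while True:
        # Q-values: immediate reward plus discounted expected next value.
        q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, q.argmax(axis=1)
        V = V_new

def evaluate_policy(P, R, gamma, policy):
    """Exact value of a stationary deterministic policy under kernel P."""
    S = P.shape[1]
    idx = np.arange(S)
    P_pi = P[policy, idx]   # row s is P[policy[s], s, :]
    r_pi = R[idx, policy]
    return np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)

# Illustrative numbers only: two states, two actions, nonnegative bounded rewards.
P_true = np.array([[[0.9, 0.1], [0.1, 0.9]],
                   [[0.5, 0.5], [0.5, 0.5]]])
P_hat = np.array([[[0.6, 0.4], [0.4, 0.6]],
                  [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 0.5]])
gamma = 0.9

V_opt, _ = value_iteration(P_true, R, gamma)      # best achievable under the true model
_, pi_hat = value_iteration(P_hat, R, gamma)      # policy planned on the estimate
V_planned = evaluate_policy(P_true, R, gamma, pi_hat)
loss = V_opt - V_planned                          # per-state loss from model error
```

The per-state `loss` is nonnegative up to numerical tolerance, since no policy can exceed the optimal value under the true transition probabilities.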
Representing and Solving Factored Markov Decision Processes with Imprecise Probabilities
This paper investigates Factored Markov Decision Processes with Imprecise Probabilities; that is, Markov Decision Processes where transition probabilities are imprecisely specified, and where their specification does not deal directly with states, but rather with factored representations of states. We first define a Factored MDPIP, based on a multilinear formulation for MDPIPs; then we propose ...
Journal
Journal title: SIAM Journal on Control and Optimization
Year: 2022
ISSN: 0363-0129, 1095-7138
DOI: https://doi.org/10.1137/21m1442152